Reddit is a great place for sports fans. News, gamethreads, analysis, highlights, and much more are posted and broken down by fellow fans. It is a huge community-driven collection that satisfies any sporting fan needs, and serves as an excellent supplement to ESPN, Bleacher Report, and the like. The /r/NFL subreddit is a perfect example of a great community.
Like nearly every popular sports website out there, Tuesday morning means NFL power rankings, where analysts rank teams 1-32 based on how good they are. Whether it be ESPN’s ‘power panel’ of rankers, 538’s ELO ranking methodology, or Bleacher Report’s one man show, power rankings are a great place to gauge how your team is doing.
But what about the community? Well, /r/NFL has their method too. Each week, 32 representatives, one from each team’s fanbase, create their own power rankings which are aggregated and posted to the community, with each representative writing a little blurb on their feelings of the team week by week. These rankings are then discussed within the community, where fans are free to cheer, complain, and sometimes all out argue over their team’s rankings. Even the rankers are available to defend their sometimes outlandish decisions, like why /u/whirledworld ranked the Jaguars #3 in Week 5.
Here are the Week 9, 2017 /r/NFL power rankings.
Now, while this is a great way to view the /r/NFL power rankings, there’s so much going on behind the scenes that we aren’t seeing from this write-up alone.
library(tidyverse) # Reading, cleaning, visualizing
library(rvest) # Reading
library(reshape) # Melting
library(plyr) # Summary stats
library(directlabels) # Viz help
library(scales) # Viz help
library(gganimate) # Animation
library(animation) # Animation
library(magick) # Animation
library(knitr)
Luckily for us, /u/NFLPowerRankers keeps track of how each ranker votes for each week in a google doc, located here, which is easily read with a package like rvest (from tidyverse). With the html page read, we have to extract the table from the document. Luckily enough in this case, we can just grab the sheets and parse them into tables. The result is a list of 18 data.frames, one for weeks 0-17 of the NFL season.
ranks_2017_url <-
read_html(
"https://docs.google.com/spreadsheets/d/e/2PACX-1vSdFW_RZwS8TAMbbVr7rRkEv5kaRduhU3CiJ1MEkIHUe-X14NykrW9IM5Rw3VE98lg_ZjYhAF-01zKO/pubhtml"
)
ranks_2017_list <- ranks_2017_url %>%
html_nodes("table") %>%
html_table()
ranks_2017_list <- ranks_2017_list[5:22] #Keep only the tables that we need
It’s probably worth double checking the number of tables and the structure of one
length(ranks_2017_list) # Number of tables
glimpse(ranks_2017_list[[1]]) # Structure
## [1] 18
## Observations: 34
## Variables: 44
## $ `` <int> 1, NA, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, ...
## $ `` <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ `` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ `` <chr> "sknich49ers", "", "Patriots", "Falcons", "Steelers", "Pack...
## $ `` <chr> "pygreg Bears", "", "Patriots", "Seahawks", "Steelers", "Pa...
## $ `` <chr> "ArbysguyBengals", "", "Patriots", "Packers", "Seahawks", "...
## $ `` <chr> "PenguinProphetBills", "", "Patriots", "Falcons", "Seahawks...
## $ `` <chr> "BlindManBaldwinBroncos", "", "Patriots", "Falcons", "Seaha...
## $ `` <chr> "ThaddeusJPBrowns", "", "Patriots", "Falcons", "Packers", "...
## $ `` <chr> "LandsdownestreetBuccaneers", "", "Patriots", "Seahawks", "...
## $ `` <chr> "TangerineDieselCardinals", "", "Patriots", "Packers", "Ste...
## $ `` <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ `` <chr> "milkchococurryChargers", "", "Patriots", "Falcons", "Seaha...
## $ `` <chr> "IIHURRlCANEIIChiefs", "", "Patriots", "Falcons", "Steelers...
## $ `` <chr> "morespikesColts", "", "Patriots", "Cowboys", "Seahawks", "...
## $ `` <chr> "staub81Cowboys", "", "Patriots", "Falcons", "Packers", "St...
## $ `` <chr> "yoda133113Dolphins", "", "Patriots", "Falcons", "Steelers"...
## $ `` <chr> "famouslastwordsEagles", "", "Patriots", "Falcons", "Seahaw...
## $ `` <chr> "wannaknowmynameFalcons", "", "Patriots", "Falcons", "Packe...
## $ `` <chr> "SexterminatorGiants", "", "Patriots", "Packers", "Steelers...
## $ `` <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ `` <chr> "preludeoflightJaguars", "", "Patriots", "Packers", "Falcon...
## $ `` <chr> "nickmangoldsbeardJets", "", "Patriots", "Seahawks", "Cowbo...
## $ `` <chr> "sosuhmeLions", "", "Patriots", "Falcons", "Giants", "Steel...
## $ `` <chr> "analogWeaponPackers", "", "Patriots", "Steelers", "Packers...
## $ `` <chr> "allsecretsknownPanthers", "", "Patriots", "Packers", "Seah...
## $ `` <chr> "SPACE_LAWYERPatriots", "", "Patriots", "Falcons", "Packers...
## $ `` <chr> "newBreedRaiders", "", "Patriots", "Falcons", "Packers", "S...
## $ `` <chr> "One_Half_HamsterRams", "", "Patriots", "Falcons", "Steeler...
## $ `` <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ `` <chr> "I_have_no_throwawayRavens", "", "Patriots", "Falcons", "Se...
## $ `` <chr> "ThatChrisDodgeRedskins", "", "Patriots", "Falcons", "Cowbo...
## $ `` <chr> "Naly_DSaints", "", "Patriots", "Seahawks", "Steelers", "Pa...
## $ `` <chr> "iltat_workSeahawks", "", "Seahawks", "Patriots", "Falcons"...
## $ `` <chr> "smacksawSteelers", "", "Patriots", "Packers", "Falcons", "...
## $ `` <chr> "SkarmotasticTexans", "", "Patriots", "Falcons", "Steelers"...
## $ `` <chr> "philo13181Titans", "", "Patriots", "Falcons", "Packers", "...
## $ `` <chr> "whirledworldVikings", "", "Steelers", "Patriots", "Seahawk...
## $ `` <int> NA, NA, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
## $ `` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ `` <chr> "Teams", "", "49ers", "Bears", "Bengals", "Bills", "Broncos...
## $ `` <chr> "Median", "", "30", "27", "20", "26", "10.5", "30", "14", "...
## $ `` <chr> "Mean", "", "29.63", "26.84", "19.41", "26.16", "11.81", "3...
## $ `` <chr> "StandardDeviation", "", "1.431", "1.679", "3.613", "2.152"...
The tables are a little raw, so let’s clean them up a bit.
ranks_2017_list_clean <- lapply(ranks_2017_list, function(x) {
x <- x[!is.na(x[,1]),]
colnames(x) <- x[1, ] # First row as names
x <- x[,-c(1,3)]
x <- x[, grep("^(NA)", names(x), value = TRUE, invert = TRUE)] # Drop NA columns
x <- x[, !(colnames(x) %in% c("Teams", "Median", "Mean", "StandardDeviation"))]
x <- x[-1,]
x <- x[,apply(x,2,is.unsorted)] # Drop columns that are sorted
x <- cbind("Rank"=1:32,x)
})
# Week numbers
names(ranks_2017_list_clean) <- 0:17
ranks_2017_list_clean[[1]]
| Rank | sknich49ers | pygreg Bears | ArbysguyBengals | PenguinProphetBills |
|---|---|---|---|---|
| 1 | Patriots | Patriots | Patriots | Patriots |
| 2 | Falcons | Seahawks | Packers | Falcons |
| 3 | Steelers | Steelers | Seahawks | Seahawks |
| 4 | Packers | Packers | Cowboys | Steelers |
| 5 | Cowboys | Falcons | Steelers | Packers |
| 6 | Chiefs | Raiders | Chiefs | Giants |
| 7 | Seahawks | Chiefs | Falcons | Cowboys |
| 8 | Raiders | Cowboys | Raiders | Raiders |
| 9 | Broncos | Broncos | Broncos | Chiefs |
| 10 | Panthers | Cardinals | Giants | Broncos |
| 11 | Giants | Giants | Cardinals | Buccaneers |
| 12 | Buccaneers | Titans | Panthers | Titans |
| 13 | Titans | Panthers | Titans | Lions |
| 14 | Cardinals | Eagles | Bengals | Cardinals |
| 15 | Lions | Texans | Vikings | Texans |
| 16 | Texans | Vikings | Lions | Ravens |
| 17 | Bengals | Buccaneers | Texans | Bengals |
| 18 | Vikings | Redskins | Redskins | Redskins |
| 19 | Redskins | Ravens | Eagles | Eagles |
| 20 | Eagles | Colts | Dolphins | Panthers |
| 21 | Ravens | Lions | Ravens | Dolphins |
| 22 | Dolphins | Saints | Buccaneers | Saints |
| 23 | Colts | Bengals | Saints | Vikings |
| 24 | Saints | Chargers | Colts | Chargers |
| 25 | Chargers | Dolphins | Chargers | Bills |
| 26 | Bills | Bears | Bills | Rams |
| 27 | 49ers | Bills | Rams | Bears |
| 28 | Bears | Rams | Bears | Jets |
| 29 | Jaguars | Jaguars | Browns | Colts |
| 30 | Browns | Browns | Jaguars | Jaguars |
| 31 | Rams | 49ers | 49ers | 49ers |
| 32 | Jets | Jets | Jets | Browns |
Worth noting above that the is.unsorted call will drop any columns that are sorted. Occasionally, if a /r/NFL ranker did not report, their column of rankings would be in alphabetical order instead of having ‘–’ values. This will take care of that issue.
Now we have our 18 weeks worth of power rankings. It’s easy to look at in this format, but unfortunately not great for analysis at all.
The melt function from the reshape package will turn the tables into ‘tidy’ data: each observation has a row, each variable has a column. The following code will melt the data, as well as drop NA values denoted by ‘–’.
df_2017 <- melt(ranks_2017_list_clean, id = 'Rank', variable_name = 'Ranker')
names(df_2017) <- c('Rank', 'Ranker', 'Team', 'Week')
df_2017$Week <- as.numeric(df_2017$Week)
# Drop NA values
df_2017[df_2017 == '--'] <- NA
df_2017 <- df_2017 %>% drop_na()
head(df_2017)
| Rank | Ranker | Team | Week |
|---|---|---|---|
| 1 | sknich49ers | Patriots | 0 |
| 2 | sknich49ers | Falcons | 0 |
| 3 | sknich49ers | Steelers | 0 |
| 4 | sknich49ers | Packers | 0 |
| 5 | sknich49ers | Cowboys | 0 |
| 6 | sknich49ers | Chiefs | 0 |
Each team generally has a primary and secondary color (some teams may have more than two), and using them will look a lot better than using default colors for visualizing. I manually keyed the primary and secondary hex codes using Team Color Codes as a basis. I also added useful things like Abbreviation, Division, and Location just in case I needed them.
nfl_color_codes <- read_csv("nfl_color_codes.csv")
df_2017_col <- merge(nfl_color_codes[,c('Team','NFL_color','NFL_color2','Div')],df_2017)
df_2017_col$Conf <- substring(df_2017_col$Div,1,3)
head(df_2017_col[with(df_2017_col,order(Week,Ranker,Rank)),])
| Team | NFL_color | NFL_color2 | Div | Rank | Ranker | Week | Conf |
|---|---|---|---|---|---|---|---|
| Patriots | #0C2340 | #C8102E | AFCE | 1 | sknich49ers | 0 | AFC |
| Falcons | #A6192E | #101820 | NFCS | 2 | sknich49ers | 0 | NFC |
| Steelers | #FFB81C | #101820 | AFCN | 3 | sknich49ers | 0 | AFC |
| Packers | #175E33 | #FFB81C | NFCN | 4 | sknich49ers | 0 | NFC |
| Cowboys | #7F9695 | #041E42 | NFCE | 5 | sknich49ers | 0 | NFC |
| Chiefs | #C8102E | #FFB81C | AFCW | 6 | sknich49ers | 0 | AFC |
Excellent, now we can begin analysis!
Before we can do any summary work, we should filter out the weeks that there are no power rankings yet. We can enter the week number below to filter out the unneeded data.
weekno <- 9
df.weekno <- df_2017_col[df_2017_col$Week <= weekno,]
Now we can grab the median, average, and standard deviation of ranks for each team by weeks. Additionally, we can grab the ranks of each team, numbered 1 through 32.
The ranking methodology here is to order by median first, then use the average as a tiebreaker.
sumstats_byweek_byteam <- ddply(df.weekno, .(Week,Team,Div,NFL_color,NFL_color2,Conf), summarize,
med = median(Rank),
avg = round(mean(Rank),2),
sd=round(sd(Rank),2))
# Order
sumstats_byweek_byteam <- sumstats_byweek_byteam[
with(sumstats_byweek_byteam,order(Week,med,avg)),
]
# Rank
sumstats_byweek_byteam$Rank <- 1:32
head(sumstats_byweek_byteam)
| Week | Team | Div | NFL_color | NFL_color2 | Conf | med | avg | sd | Rank |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Patriots | AFCE | #0C2340 | #C8102E | AFC | 1 | 1.06 | 0.25 | 1 |
| 0 | Falcons | NFCS | #A6192E | #101820 | NFC | 2 | 3.22 | 1.81 | 2 |
| 0 | Packers | NFCN | #175E33 | #FFB81C | NFC | 4 | 4.09 | 1.80 | 3 |
| 0 | Steelers | AFCN | #FFB81C | #101820 | AFC | 4 | 4.50 | 1.83 | 4 |
| 0 | Seahawks | NFCW | #4DFF00 | #001433 | NFC | 5 | 4.50 | 1.92 | 5 |
| 0 | Cowboys | NFCE | #7F9695 | #041E42 | NFC | 5 | 5.50 | 1.76 | 6 |
We can also grab the same stats by division by week
sumstats_byweek_bydiv <- ddply(df.weekno, .(Week,Div,Conf), summarize,
med = median(Rank),
avg = round(mean(Rank),2),
sd=round(sd(Rank),2))
head(sumstats_byweek_bydiv)
| Week | Div | Conf | med | avg | sd |
|---|---|---|---|---|---|
| 0 | AFCE | AFC | 25.0 | 19.97 | 11.74 |
| 0 | AFCN | AFC | 19.0 | 17.88 | 9.54 |
| 0 | AFCS | AFC | 19.5 | 19.99 | 7.28 |
| 0 | AFCW | AFC | 9.0 | 12.51 | 7.11 |
| 0 | NFCE | NFC | 13.0 | 12.67 | 6.16 |
| 0 | NFCN | NFC | 17.0 | 16.23 | 8.68 |
Now that we have our summary stats, we can look at some interesting things like rankings over time, both by team and by division.
Let’s first take a look at some line plots of true rankings over time, to get an idea of how teams are faring as the season progresses.
l1 <- ggplot(sumstats_byweek_byteam,aes(x=Week,y=Rank,group=Team,color=NFL_color))+
theme_classic()+
geom_line()+
geom_point(aes(color=NFL_color))+
scale_y_reverse()+
scale_color_identity()+
scale_x_continuous(breaks=pretty_breaks(n=weekno+1)) +
geom_dl(aes(label=Team),method='last.qp',cex=0.8)+
ggtitle(paste0('/r/NFL Power Rankings Through Week ', weekno))
ggsave(paste0('img/LineGraphWeek',weekno,'.png'),l1,height=7,width=11)
Things look a little crowded here, so let’s facet by division and then do some fun animation
a1 <- ggplot(sumstats_byweek_byteam,aes(x=Week,y=Rank,group=Team,color=NFL_color,frame=Week,cumulative=TRUE))+
theme_classic()+
geom_line()+
geom_point(aes(color=NFL_color))+
scale_y_reverse()+
scale_color_identity()+
scale_x_continuous(breaks=pretty_breaks(n=weekno+1)) +
facet_wrap(~Div,ncol=2)+
ggtitle('/r/NFL Power Rankings Through Week')
ani1 <- gganimate(a1,'img/ani1.gif')
Looks great!
Another thing we could look at is the average divisional strength over time based on our lovely rankers.
p1 <- ggplot(sumstats_byweek_bydiv,aes(x=Week,y=avg,group=Div,color=Conf,frame=Week,shape=Div,cumulative=TRUE))+
theme_classic()+
geom_line(aes(linetype=Div))+geom_point()+
scale_y_reverse()+
scale_x_continuous(breaks=pretty_breaks(n=weekno+1)) +
ggtitle(paste0('Average Divisional Rankings through Week '))
ani2 <- gganimate(p1,'img/DivRank.gif',height=7,width=11)
Now we can look at the box plots of the data, to see exactly the distribution of rankings. We’ll just grab the current week’s data for the moment.
df.week <- df.weekno[df.weekno$Week == weekno,]
sumstats.week <- sumstats_byweek_byteam[sumstats_byweek_byteam$Week == weekno,]
week.rank <- sumstats.week$Team
b1 <- ggplot(df.week,aes(x=Team,y=Rank,fill=NFL_color))+
theme_classic()+
geom_jitter(aes(color=NFL_color))+ # Plot the points
scale_color_identity()+ # Color the points
stat_boxplot(geom='errorbar')+
geom_boxplot(outlier.shape=NA)+ # Hide the outliers
scale_y_reverse()+
scale_fill_identity()+ # Color the boxplot
scale_x_discrete(limits=week.rank)+ # Order
theme(axis.text.x = element_text(angle = 90,vjust = 0.5 ))+
ggtitle(paste0('/r/NFL Week ', as.character(weekno), ' Rankings'))
ggsave('img/Plainbox.png',b1)
It’s pretty tough to see the medians for some teams, so we’ll color them using the secondary color. In order to do this, we must add the NFL_color2 aesthetic to the boxplot. Unfortunately, this will make the outline and the whiskers of the boxplots the secondary color as well, which isn’t what I’d like to go for here.
So, to get the desired output, we need to add the NFL_color2 aesthetic to our boxplot, grab the build using ggplot_build then use that data to draw our medians in the secondary colors.
r <- ggplot(df.week,aes(x=Team,y=Rank,fill=NFL_color,color=NFL_color2))+
geom_boxplot()+
scale_x_discrete(limits=week.rank)+
scale_y_reverse()+
scale_fill_identity()+
scale_color_identity()
dat <- ggplot_build(r)$data[[1]]
We will use xmin, xmax, and middle variables to draw the median. The colour variable will help us color the lines with the secondary color.
b2 <- b1 + geom_segment(data=dat, aes(x=xmin, xend=xmax, y=-1*middle, yend=-1*middle,color=colour),
inherit.aes = FALSE,
size = 1) +
theme_classic()+
theme(axis.text.x = element_text(angle = 90,vjust = 0.5 ))+
scale_color_identity()
ggsave('img/Prettybox.png',b2)
Beautiful. Now we have team-themed boxplots. We could stop here and admire our masterpiece, but I’m still not satisfied. It’s great that we can see the order of the teams, but including the summary stats would add the cherry on top of this plot.
b3 <- b2 + geom_text(data = sumstats.week, aes(x=Team,y=34,label=Rank),inherit.aes=F,size=2) +
geom_text(data = sumstats.week, aes(x=Team,y=36,label=med),inherit.aes=F,size=2) +
geom_text(data = sumstats.week, aes(x=Team,y=38,label=avg),inherit.aes=F,size=1.75)
b3 <- b3 + scale_y_reverse(labels=c('0','10','20','30','Rank','Med','Avg'),breaks=c(0,10,20,30,34,36,38))
ggsave('img/StatBox.png',b3,height=7,width=11)
Now, we can gather everything together, and write a
for loop to create a boxplot for each week and store them in a list. We can also save each week’s rankings plot to use later.
Now that we have our boxplots for each week, it only makes sense to look at them as the weeks progress through the season.
While we could use something interesting like gganimate as above, unfortunately geom_boxplot does not include a frame aesthetic. Instead, we can use the magick and animate packages to help us stitch our graphs together easier. See here and here for more info.
img <- image_graph(900, 700, res = 96)
datalist <- split(df.weekno,df.weekno$Week)
out <- lapply(datalist,function(data){
sumstats <- ddply(data, .(Week,Team,Div,NFL_color,NFL_color2), summarize,
med = median(Rank),
avg = round(mean(Rank),2),
sd=round(sd(Rank),2))
sumstats <- sumstats[
with(sumstats,order(med,avg)),
]
sumstats$Rank <- 1:32
r <- ggplot(data,aes(x=Team,y=Rank,fill=NFL_color,color=NFL_color2))+
geom_boxplot()+
scale_x_discrete(limits=week.rank)+
scale_y_reverse()+
scale_fill_identity()+
scale_color_identity()
dat <- ggplot_build(r)$data[[1]]
q <- ggplot(data,aes(x=Team,y=Rank,fill=NFL_color))+
theme_classic()+
geom_jitter(aes(color=NFL_color),width = 0.2)+
scale_color_identity()+
stat_boxplot(geom ='errorbar') +
geom_boxplot(outlier.shape = NA)+
theme(axis.text.x = element_text(angle = 90,vjust = 0.5 ))+
scale_x_discrete(limits=week.rank)+
scale_y_reverse()+
scale_fill_identity()+
ggtitle(paste0('/r/NFL Rankings for Week ',data$Week))
q <- q + geom_segment(data=dat, aes(x=xmin, xend=xmax, y=-1*middle, yend=-1*middle,color=colour),
inherit.aes = FALSE,
size = 1.25) +
scale_color_identity()
q <- q + geom_text(data = sumstats, aes(x=Team,y=34,label=Rank),inherit.aes=F,size=2) +
geom_text(data = sumstats, aes(x=Team,y=36,label=med),inherit.aes=F,size=2) +
geom_text(data = sumstats, aes(x=Team,y=38,label=avg),inherit.aes=F,size=1.75) +
theme(axis.text.x = element_text(angle = 90,vjust = 0.5 ))
q <- q + scale_y_reverse(labels=c('0','10','20','30','Rank','Med','Avg'),breaks=c(0,10,20,30,34,36,38))
print(q)
})
dev.off()
## quartz_off_screen
## 2
img <- image_background(image_trim(img), 'white')
animation <- image_animate(img, fps = 1)
image_write(animation,'img/boxani1.gif')
Now we can really see how teams ended up where they are now. Personally, the most interesting thing I see is how far the Packers dropped after Aaron Rodgers got hurt in Week 6 against the Vikings. Poor packers!
ggplot(df.weekno,aes(x=Week,y=Rank,group=Team,color=NFL_color))+
theme_classic()+
geom_jitter(alpha=0.9)+
scale_color_identity()+
scale_y_reverse()+
facet_wrap(~Div,ncol=2)+
geom_smooth(aes(color=NFL_color2))